Abstract: The high-speed dynamic True Single Phase 
Clock (TSPC) logic design style offers fully pipelined logic 
circuits using only one clock signal, which makes clock 
distribution simple and compact. Converting simple logic 
gates to pipelined TSPC logic gates increases the transistor 
count, because the standard cell implementation of a logic 
function uses both an N-block and a P-block to remove 
transparency between pipelined stages, even though the logic 
function is implemented only in the N-block. In this paper 
we present a technique in which TSPC logic cells are 
implemented as both cell_N and cell_P variants, where each 
cell performs a logic function along with only one 
type of latching operation. Such an implementation allows a 
systematic approach for converting un-pipelined circuits to 
fully pipelined circuits. Alternating cell_N and cell_P 
stages behave as a dynamic register and remove transparency 
between pipelined stages. An appropriate number of 
dynamic registers is used to equalize stage delays for all 
paths and to remove transparency between pipelined stages. 
The modified TSPC implementation shows a reduction of almost 
40% to 50% in transistor count and almost 50% in clock 
cycles compared to the worst-case standard TSPC 
implementation. The worst-case standard TSPC implementation 
assumes that no logic merging is possible with the P-block, 
since the inputs to a cell may arrive after different cycle 
delays. The modified TSPC logic circuit implementation 
preserves all the advantages of the standard TSPC logic 
implementation and in addition offers reduced circuit 
complexity due to the reduced transistor count per logic 
cell. The proposed logic design style reduces layout area 
and average power consumption compared to the standard TSPC 
pipelined circuit implementation. 
Index Terms — TSPC, dynamic, pipelined, high speed, 
transistor count, circuit complexity 

I. Introduction 

Conventional static CMOS circuit implementations 
cannot fulfill the requirements of high-throughput 
pipelined digital designs; dynamic logic circuits 
can therefore be used to speed up such designs. The high-speed 
dynamic True Single Phase Clock (TSPC) logic design 
style offers fully pipelined logic circuits using only one 
clock signal, which makes clock distribution simple and 
compact [1], [2], [8]. The simple clock distribution of 
TSPC circuits reduces the layout area for clock lines, 
avoids the clock skew problem, and increases speed [1], [2], 
[8]. The TSPC logic style shows robust cell 
characteristics, and therefore a standard cell 
implementation is possible. The TSPC cell consists of 
one N-block and one P-block, each driven by a single clock 
signal (φ) [1], [2], [5], [7]. In the standard TSPC logic style 
the implementation of simple gates like AND, OR, and XOR 
increases the transistor count, since each logic cell 
uses both an N-block and a P-block 
to remove transparency between the pipelined stages, 
while the logic function is implemented only in the N-block 
[3], [5]. This makes the P-block redundant, since it 
performs no function other than latching. Although 
logic merging is possible [1], [2] in a standard TSPC cell 
by implementing logic functions with both the N-block 
and the P-block, such logic merging is inefficient, since 
in practical cases the inputs to a gate may arrive after 
different cycle delays, which may not allow effective 
logic merging. The implementation complexity also 
increases due to the reduced freedom for logic merging, since 
it requires careful and tedious mapping of logic 
functions onto both blocks. The worst-case standard 
TSPC logic implementation assumes that each cell's logic 
function is implemented with the N-block only, while the 
P-block does not implement any logic but provides dynamic 
latching to avoid transparency between pipelined 
stages (i.e. no logic merging is possible). Furthermore, the 
pipelined circuit requires stage registers (REG) to 
equalize the stage delays for all paths; each REG is 
implemented as one LATCH_N and one LATCH_P in 
cascade. In standard TSPC logic cells, inefficient logic 
merging increases the number of pipelined stages that need 
extra register operations, which increases the transistor 
count [3], [6]. In this paper we present a technique in 
which TSPC logic cells are implemented as cell_N and 
cell_P, where each cell performs a logic function 
along with only one type of latching operation (either 
N-latch or P-latch). The transparency between the pipelined 
stages can be removed by alternating cell_N and cell_P 
assignment. Since each cell performs a logic operation 
along with latching, such an implementation reduces 
the number of redundant transistors through efficient 
merging of logic. 

The circuit complexity is reduced due to the decrease in 
transistor count per logic cell. Since each cell performs 
a logic function, the burden of stage register 
operations to equalize the stage delays for all paths is 
reduced by almost 50% compared to the worst-case 
standard TSPC logic implementation. The proposed 
technique reduces the latency and increases the 
throughput of digital pipelined circuit implementations, 
since the number of clock cycles is reduced by almost 
50%. The proposed logic design style offers advantages 
such as pipelining, low average power, low latency and high 
throughput. A 3-bit pipelined adder is designed using the 
modified TSPC logic style and compared with the standard 
TSPC logic circuit implementation. Logic 
implementation using the proposed logic design style is 
described in Section II; circuit implementation is 
discussed in Section III. Section IV shows the 
comparison between standard TSPC and modified TSPC, 
and Section V concludes the paper. 



II. Logic Implementation 



The proposed modified TSPC implementation allows 
the flexibility of implementing a logic function with either 
an N-block or a P-block; hence converting 
simple logic cells to pipelined logic cells shows a large 
reduction in transistor count compared to the standard 
TSPC logic design style. Simple logic cells like 
AND, OR, and XOR can be designed both as N-block 
logic cells and as P-block logic cells. The logic cells 
produced for the AND function are AND_N and AND_P; 
OR and XOR similarly produce OR_N and OR_P, and 
XOR_N and XOR_P, respectively. A simple AND2 gate 
can be implemented in two different ways (AND2_N and 
AND2_P) as shown in Figure 1. The operations of 
AND2_N and AND2_P are shown in Table I. REG can be 
split into LATCH_N and LATCH_P. The latching 
performed by these two latches is shown in Figure 2. 
These latches are used to equalize cycle delays, since 
alternating LATCH_N and LATCH_P form a dynamic 
register REG [4]. Table II explains the operation of both 
latches. For a pipelined circuit implementation, cell_N 
and cell_P must be placed one after the other in cascade, 
with appropriate latching to equalize stage delays 
for all paths, as explained in Section III. The operation of 
cell_N/cell_P and LATCH_N/LATCH_P is summarized 
in Table III. 
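
The way alternating latches remove transparency can be illustrated with a small behavioral model. The Python sketch below is not taken from the paper. It assumes the usual TSPC convention that LATCH_N is transparent while the clock is high and LATCH_P while the clock is low (the contents of Table II are not available here), and the names `Latch` and `run_cascade` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Latch:
    """Behavioral level-sensitive latch.

    transparent_level is the clock value at which the input passes through
    (assumption: 1 models LATCH_N, 0 models LATCH_P).
    """
    transparent_level: int
    q: int = 0

    def step(self, clk: int, d: int) -> int:
        if clk == self.transparent_level:
            self.q = d      # transparent phase: output follows input
        return self.q       # opaque phase: hold the stored value


def run_cascade(levels, samples):
    """Drive a chain of latches with (clk, d) samples and return the chain output."""
    latches = [Latch(level) for level in levels]
    outputs = []
    for clk, d in samples:
        value = d
        for latch in latches:
            value = latch.step(clk, value)
        outputs.append(value)
    return outputs


# One (clk, d) pair per half cycle.
samples = [(1, 1), (0, 0), (1, 0), (0, 1)]

# LATCH_N followed by LATCH_P: the pair acts as a register (REG).
reg_out = run_cascade([1, 0], samples)

# Two latches of the same type: the input races through in one phase.
transparent_out = run_cascade([1, 1], samples)
```

With these samples the N/P cascade produces `[0, 1, 1, 0]`: the value captured in the high phase appears only in the following low phase, and input changes during the low phase are ignored. The N/N cascade produces `[1, 1, 0, 0]`, so the input reaches the output within the same phase, which is the transparency that alternating cell types are meant to prevent.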



Table I. Operation of AND2_N and AND2_P 



Table II. Operations of LATCH_N and LATCH_P 



Table III. Summary of operation 



Figure 2. LATCH_N and LATCH_P Cells (CLK= 



III. Circuit Implementation 
A 1-bit pipelined adder (Figure 3) has 
been designed using XOR2, AND2 and OR3 gates, where 
the first stage is implemented in cell_N logic while the 
second stage is implemented in cell_P logic. A half-cycle 
latch LATCH_N is used in parallel with the cell_N logic to 
equalize the stage delay; thus the circuit takes one clock 
cycle to produce the correct output. In the standard TSPC 
implementation, XOR2_N and XOR2_P are replaced by 
XOR2, LATCH_N is replaced by REG, AND2_N is 
replaced by AND2, and OR3_P is replaced by OR3; 
it therefore takes 2 clock cycles to produce the correct output. 
A comparison of transistor counts for the two 
implementations is shown in Table IV. Following the 
hierarchical design approach, a 3-bit pipelined ripple 
adder can be designed using the above 1-bit adder and 
placing latches at appropriate positions, as shown in 
Figure 4. The circuit has a delay of three (cell_N+cell_P) 
pairs. Since one (cell_N+cell_P) pair forms a dynamic 
register (REG) and takes one clock cycle to produce its 
output, the output is available 3 clock cycles after 
any input is applied. The standard TSPC implementation 
uses REG instead of LATCH_N or LATCH_P and 
therefore requires almost double the transistor count and a 
6-clock-cycle delay for any applied input. The comparison of 
transistor counts for the two implementations is 
shown in Table IV. Both circuit implementations were 
verified using a transistor-level simulator (T-Spice) from 
Tanner Research, Inc., using the parameters of a 0.5 
µm technology. The simulation results show correct 
pipelined circuit behavior. The supply voltage V_DD was 
kept at 3.3 V. 
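
The two-stage partitioning of the 1-bit adder can be checked functionally with a short sketch. The Python code below is an illustration only: Figure 3 is not available here, so the exact wiring is an assumption (carry formed as ab + bc + ca with three AND2 gates and one OR3 gate, the carry input passed through LATCH_N), and the function names are hypothetical. It models logic values only, not clocking or timing.

```python
def adder_stage_n(a, b, cin):
    """First half cycle (cell_N logic plus LATCH_N), assumed wiring."""
    return {
        "p": a ^ b,      # XOR2_N: partial sum
        "cin": cin,      # LATCH_N: delays cin so both paths stay aligned
        "ab": a & b,     # AND2_N
        "bc": b & cin,   # AND2_N
        "ca": cin & a,   # AND2_N
    }


def adder_stage_p(stage1):
    """Second half cycle (cell_P logic), assumed wiring."""
    total = stage1["p"] ^ stage1["cin"]                  # XOR2_P: sum bit
    carry = stage1["ab"] | stage1["bc"] | stage1["ca"]   # OR3_P: carry out
    return total, carry


def pipelined_add3(a, b, cin=0):
    """Add two 3-bit integers by chaining three 1-bit adders (ripple carry)."""
    carry = cin
    result = 0
    for i in range(3):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        s, carry = adder_stage_p(adder_stage_n(bit_a, bit_b, carry))
        result |= s << i
    return result | (carry << 3)   # carry out becomes bit 3
```

Each call to `adder_stage_n` followed by `adder_stage_p` corresponds to one (cell_N+cell_P) pair, so the three iterations in `pipelined_add3` mirror the three clock cycles of latency described above.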
Figure 3. 1-bit pipelined adder circuit 



IV. Comparison 
A comparison of transistor counts for the 3-bit 
pipelined adder implemented in the standard 
TSPC logic design style and in the modified TSPC logic design 
style is shown in Table IV. The comparison shows a 
transistor count about 46% to 47% lower for the modified 
TSPC logic design style. The other advantage of the modified 
TSPC logic implementation is the reduction in the number of 
clock cycles: the output is available after 3 clock cycles, 
compared to the 6 clock cycles required in the standard TSPC 
implementation. Because of the P-block, the 
PMOS devices need to be sized up to attain the same 
gate performance, since PMOS carrier mobility (μ_p) 
is lower than NMOS carrier mobility (μ_n). 
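
The cycle-count figures quoted above follow from simple arithmetic, sketched below in Python. The function names are hypothetical, and the model rests on two assumptions drawn from the text: in the modified style each cell_N or cell_P stage occupies half a clock cycle, while in the worst-case standard style each logic stage (N-block plus latching P-block) occupies a full cycle.

```python
def cycles_modified(logic_stages: int) -> float:
    """Modified TSPC: every cell_N or cell_P stage takes half a clock cycle."""
    return logic_stages / 2


def cycles_standard(logic_stages: int) -> float:
    """Worst-case standard TSPC: every logic stage takes one full clock cycle."""
    return float(logic_stages)


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative saving of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# The 3-bit adder chains three 1-bit adders of two logic stages each.
stages_3bit = 3 * 2
latency_modified = cycles_modified(stages_3bit)
latency_standard = cycles_standard(stages_3bit)
latency_saving = percent_reduction(latency_standard, latency_modified)
```

For the 3-bit adder this gives 3 cycles against 6 cycles, a 50% reduction, matching the figures in the text. The same `percent_reduction` helper can be applied to the transistor counts of Table IV.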



V. Conclusion 
This paper presented a modified true single-phase 
clock logic design style for implementing pipelined circuits 
for high-speed real-time applications. The modified 
implementation scheme reduces the transistor 
count by 40% to 50% compared to the standard TSPC 
implementation. 
Figure 4. 3-bit pipelined adder circuit 



Table IV. Comparison table 
The proposed logic style offers almost a 50% reduction 
in the number of clock cycles; therefore the pipelined logic 
circuit can be designed to offer reduced latency and 
increased throughput compared to the standard TSPC 
circuit implementation. The proposed modified TSPC 
logic design style is best suited for converting 
un-pipelined circuits to fully pipelined circuits with effective 
logic merging. The circuit complexity is reduced due to the 
reduced number of transistors per logic cell. The 
proposed logic style offers a compact 
layout area and reduced average power consumption 
for pipelined circuit implementations. The use of a single clock 
makes clock distribution simple and compact. 